Safely Interruptible Agents

Authors

Laurent Orseau

Stuart Armstrong

Abstract

Reinforcement learning agents interacting with a complex environment like the real world are unlikely to behave optimally all the time. If such an agent is operating in real time under human supervision, now and then it may be necessary for a human operator to press the big red button to prevent the agent from continuing a harmful sequence of actions (harmful either for the agent or for the environment) and lead the agent into a safer situation. However, if the learning agent expects to receive rewards from this sequence, it may learn in the long run to avoid such interruptions, for example by disabling the red button, which is an undesirable outcome. This paper explores a way to make sure a learning agent will not learn to prevent (or seek!) being interrupted by the environment or a human operator. We provide a formal definition of safe interruptibility and exploit the off-policy learning property to prove that either some agents are already safely interruptible, like Q-learning, or can easily be made so, like Sarsa. We show that even ideal, uncomputable reinforcement learning agents for (deterministic) general computable environments can be made safely interruptible.
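The off-policy property the abstract refers to can be illustrated with a minimal sketch. This is not the paper's formal construction; the tabular setup, the function names, and the toy values below are all assumptions chosen for illustration. The Q-learning target takes a maximum over next actions, so it does not depend on which action is actually executed next, while the Sarsa target uses the action actually taken, which an interruption may have forced.

```python
from typing import Dict, List, Tuple

State = int
Action = int
QTable = Dict[Tuple[State, Action], float]


def q_learning_target(q: QTable, reward: float, next_state: State,
                      actions: List[Action], gamma: float) -> float:
    """Off-policy target: bootstraps from the best next action,
    whatever action is actually executed next."""
    return reward + gamma * max(q[(next_state, a)] for a in actions)


def sarsa_target(q: QTable, reward: float, next_state: State,
                 next_action: Action, gamma: float) -> float:
    """On-policy target: bootstraps from the action actually taken next,
    which may be one forced by an interruption."""
    return reward + gamma * q[(next_state, next_action)]


def demo() -> Tuple[float, float, float]:
    # Hypothetical toy values: in state 1, action 0 is valuable and
    # action 1 stands for the "safe" action an operator would force.
    actions = [0, 1]
    q: QTable = {(s, a): 0.0 for s in (0, 1) for a in actions}
    q[(1, 0)] = 1.0
    gamma, reward = 0.9, 0.0

    forced_action, free_action = 1, 0
    ql = q_learning_target(q, reward, 1, actions, gamma)
    sarsa_forced = sarsa_target(q, reward, 1, forced_action, gamma)
    sarsa_free = sarsa_target(q, reward, 1, free_action, gamma)
    return ql, sarsa_forced, sarsa_free


if __name__ == "__main__":
    print(demo())
```

In this toy case the Q-learning target is the same whether or not the next action is forced, whereas the Sarsa target drops when the interruption imposes the low-value action. That gap gives only the intuition behind the abstract's distinction between agents that are already safely interruptible and agents that need modification; the paper's actual definitions and proofs are not reproduced here.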

Similar resources

Interruptible Critical Sections

We present a new approach to synchronization on uniprocessors with special applicability to embedded and real-time systems. Existing methods for synchronization in real-time systems are pessimistic, and use blocking to enforce concurrency control. While protocols to bound the blocking of high priority tasks exist, high priority tasks can still be blocked by low priority tasks. In addition, thes...

Safely and Efficiently Updating References During On-line Reorganization

With today’s demands for continuous availability of mission-critical databases, on-line reorganization is a necessity. In this paper we present a new on-line reorganization algorithm which defers secondary index updates and piggybacks them with user transactions. In addition to the significant reduction of the total I/O cost, the algorithm also assures that almost all the database is available ...

On Interruptible Pure Exploration in Multi-Armed Bandits

Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate pe...

Interruptible Electricity Contracts from an Electricity Retailer's Point of View: Valuation and Optimal Interruption

We consider interruptible electricity contracts issued by an electricity retailer that allow for interruptions to electric service in exchange for either an overall reduction in the price of electricity delivered or for financial compensation at the time of interruption. We provide an equilibrium model to determine electricity prices based on stochastic models of supply and demand. In the conte...

Transmission congestion management in bilateral markets: An interruptible load auction solution

This paper demonstrates that appropriate invocation of interruptible loads by the independent system operator (ISO) can aid in relieving transmission congestion in power systems. An auction model is proposed, for an ISO operating in a bilateral contract dominated market, for real-time selection of interruptible load offers while satisfying the congestion management objective. The proposed conge...

Publication date: 2016
